-
Notifications
You must be signed in to change notification settings - Fork 675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added code to make pointer vector be thread safe #1325
Conversation
Kinda curious, is there a particular issue that requires the vector being thread safe? Also, if it isn't required all the time, the feature can be made opt-in by injecting the |
We had a requirement when we need to read the packet simultaneously for some kind of edge processing. Yes mutex may be overhead in non-thread environment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@muthusaravanan3186 I believe we can use std::lock_guard
in this case which is simpler, no?
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## dev #1325 +/- ##
=======================================
Coverage 82.34% 82.35%
=======================================
Files 163 163
Lines 20937 20975 +38
Branches 7906 7910 +4
=======================================
+ Hits 17241 17273 +32
- Misses 3025 3030 +5
- Partials 671 672 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
I think @Dimi1010 has a good point. Maybe we can add an optional parameter in the constructor saying whether or not to use the mutex |
I think if we want to control using lock or not, maybe consider using |
adding an additional |
@seladb @tigercosmos I was thinking more along the lines of how spdlog solved it with their logging sinks. Ends up requiring a template but you don't really have any runtime overhead for the non-mutex version, and it is not a global switch of all or nothing like a preprocessor directive would be. Example from spdlog: |
I'm not sure implementing a |
Might be, might not. They have
whose The benefit over the if approach is that when you need a non-thread safe version, you don't need to construct and store a mutex anyway in the struct and it removes the need for ifs before every lock/unlock operation. It also allows you to actually use the I think the potential minor performance cost of a function call over an |
@Dimi1010 's proposal makes sense to me, especially readability and maintainability. |
are we fine with above implementation of using null mutex. If so i can go and make the necessary changes |
Yes, I think we can go with this approach! Thank you @Dimi1010 for jumping in and proposing it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments on some cases where the mutex use does not actually keep the structure thread safe.
Honestly, addressing those changes to keep the vector thread safe internally in all cases seems to me being more of a hassle then just having multithreaded code that access the vector simultaneously keep their own synchronization between them then relying on the vector implementation to do the sync. A vector class has too many ways to access/modify the underlying data.
VectorIterator begin() { return m_Vector.begin(); } | ||
VectorIterator begin() | ||
{ | ||
std::lock_guard<Mutex> lk(m_Mutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this protects as much as you would think.
Yes, you don't get data races when obtaining the iterator but vector iterators generally modify the vector memory on their own, so after the safety car is gone (begin iterator is obtained), all the threads are free to race again via direct iterator modification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the case on normal scenario too even without multi threading environment. we get an iterator and after some time vector reconstructs itself then the above iterator will be undefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, tho in a non-multithreaded scenario you don't have the option of another thread invalidating your iterator whenever. Iterators are invalidated in specific documented calls and the sequence of those calls in relation to iterator usage is deterministic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are planned to use shared_ptr in another PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we want to add this line (and all other similar ones) later when you raise another PR.
@muthusaravanan3186 would you like to finish this PR? |
@tigercosmos Our requirement was single producer and single consumer. But based on review comments it seems we need to address multiple consumer too. I just want to know what is the expectation here. If we need to change the PointerVector to hold shared ptr references; then it will be big change.. Please let me know what is expected here. |
@muthusaravanan3186 we can do it in phases. Make some changes discussed above without using shared pointers, and then in the future open another PR to move to shared pointers. The improvements suggested in this PR will make |
@seladb can you review now |
@muthusaravanan3186 could you resolve the conflicts? |
@seladb i resolved all the conflicts. But on one of the conflict i am getting the below message |
Not sure what did you do with the merge, but you can resolve the conflicts at your forked repo.
|
@muthusaravanan3186 you can look at this StackOverflow question: https://stackoverflow.com/questions/38949951/how-to-solve-merge-conflicts-across-forks Please notice that you should resolve conflicts between |
@seladb Resolved all the conflicts. |
Common++/header/PointerVector.h
Outdated
|
||
/** | ||
* @class PointerVector | ||
* A template class for representing a std::vector of pointers. Once (a pointer to) an element is added to this vector, | ||
* the element responsibility moves to the vector, meaning the PointerVector will free the object once it's removed from the vector | ||
* This class wraps std::vector and adds the capability of freeing objects once they're removed from it | ||
*/ | ||
template<typename T> | ||
template<typename T, typename Mutex=pcpp::nullMutex> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please update the document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename it to NullMutex
for consistency in this project. Also, maybe it's better to put NullMutex
in detail
namespace, since it is only for internal usage.
namespace detail
{
struct NullMutex { ... }
} // namespace detail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to NullMutex. But detail namespace is only present on Pcap++. Since this on Common folder hoping will place it here which is more reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tigercosmos where we need to update the document?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can update the comment, mentioning about the template argument. We use Doxygen to generate the document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Common++/header/PointerVector.h
Outdated
@@ -70,7 +83,7 @@ namespace pcpp | |||
throw; | |||
} | |||
} | |||
} | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: space
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Fix format, please ensure that you run
pre-commit
before submitting the code. You can also check if all CI works on your forked repo (Allow CI on the forked repo) - Remove all changes that you plan to use on the next PR, e.g.
begin()
,front()
. It doesn't make sense to have a WIP code here, please remove the unnecessary changes in this PR, and if you have time to raise another PR, we can have those changes back on that PR.
@tigercosmos There is no WIP PR yet; as discussed plan is to merge this PR and start to work on use of shared_ptr. |
I am okay If you want to add the support of shared_ptr, why not add it in this PR? template<typename T, typename Mutex=pcpp::NullMutex, typename PointerType=std::add_pointer_t<T> >
class PointerVector {
void pushBack(PointerType element)
{
std::lock_guard<Mutex> lk(m_Mutex);
m_Vector.push_back(element);
}
PointerType front()
{
std::lock_guard<Mutex> lk(m_Mutex);
return m_Vector.front();
}
} |
As discussed this was the plan @muthusaravanan3186 we can do it in phases. Make some changes discussed above without using shared pointers, and then in the future open another PR to move to shared pointers. The improvements suggested in this PR will make PointerVector mostly thread safe (even if not 100%). We can mention the edge cases in the doxygen documentation |
Then I oppose adding unnecessary locks for the future support of the shared pointer, because (1) those locks are useless for |
@tigercosmos Back to square one. we have null mutex for normal case and mutex for multi-threading case. This was discussed. Since it is vector of pointers and we are accessing the pointers which itself is not thread safe. Since changing Pointervector to handle shared Pointers is a huge change; we planned to implement in phases. If we need everything at single shot it may take time. Please let me know what is the expectation. If not will drop this PR and work on something else. We already modified PCAPPLUSPLUS according to our need. |
I cannot make decision for this, let's wait for @seladb |
@muthusaravanan3186 @tigercosmos @Dimi1010 I'm trying to catch up here... I think we need to decide on 2 issues, please correct me if I'm wrong:
Which issue(s) are you concerned about and is blocking this PR? Regarding (1) - do you have any (simple) suggestion on how to solve it? |
@seladb @tigercosmos @muthusaravanan3186 Tbh, I don't really see an easy solution to (1) while keeping the current API of the class. The two ways I could see iteration being done both have problems:
Regarding point 2. Using With all that said, I am personally of the opinion that the burden of thread safety should not be placed on the vector object itself, but on the program that requires the vector data to be accessed by multiple threads simultaneously. My reasoning is that the vector and raw arrays in general provide way too many ways to access or mutate the data to be easily secured, while the program attempting to access the data from different threads would have a far better knowledge of how the data is being accessed to provide thread-safe data access and avoiding race conditions. A simpler solution would be having the different threads make a local copy of the vector contents while under synchronization and then proceeding independently using their local copies. This type of solution minimizes the points of contention where sync lock needs to be acquired between the threads, as the alternative would require a sync lock on every element access or mutation. It also removes the need of the underlying objects that are being held by the array being thread-safe too. |
After reading @Dimi1010 's response I started wondering what is the use case of a thread-safe pointer vector if both (1) and (2) are not solved... In general, a thread-safe container is only needed when one or more threads are changing the vector, usually while other threads are reading data from it:
@muthusaravanan3186 can you share your use-case for thread-safe pointer vector? It might help us decide if we actually want to implement it or not |
@muthusaravanan3186 @Dimi1010 @tigercosmos let's decide if we want to keep this PR or close it? |
@seladb sorry for late reply.. let's close this PR. Let me if you have anything to work.. will start to work on that |
IMO, the thread-safe issue is interesting. Perhaps put this to issue and keep tracking? |
@tigercosmos there was an issue for it: #234 but we closed it. We can reopen it if needed, but the real question is whether we want to address it, and if so - how? |
@muthusaravanan3186 thank you for being willing to help! You can choose any issue from the issues list or suggest an improvement of your own. If you're looking for "quick wins" you can work on these issues: Or if you want to implement a new protocol: |
I vote close the PR. I don't believe the vector itself is the correct location for thread safety to be located as I explained in my earlier reasoning. |
sure, let's close it. |
Added code to make pointer vector be thread safe. Used mutex to lock every operation done on the pointer vector.